Signatures of selection and drivers for novel mutation on transmission-blocking vaccine candidate Pfs25 gene in western Kenya

Background Leading transmission-blocking vaccine candidates such as Plasmodium falciparum surface protein 25 (Pfs25 gene) may undergo antigenic alterations which may render them ineffective or allele-specific. This study examines the level of genetic diversity, signature of selection and drivers of Pfs25 polymorphisms of parasites population in regions of western Kenya with varying malaria transmission intensities. Methods Dry blood spots (DBS) were collected in 2018 and 2019 from febrile outpatients with malaria at health facilities in malaria-endemic areas of Homa Bay, Kisumu (Chulaimbo) and the epidemic-prone highland area of Kisii. Parasites DNA were extracted from DBS using Chelex method. Species identification was performed using real-time PCR. The 460 base pairs (domains 1–4) of the Pfs25 were amplified and sequenced for a total of 180 P. falciparum-infected blood samples. Results Nine of ten polymorphic sites were identified for the first time. Overall, Pfs25 exhibited low nucleotide diversity (0.04×10−2) and low mutation frequencies (1.3% to 7.7%). Chulaimbo had the highest frequency (15.4%) of mutated sites followed by Kisii (6.7%) and Homa Bay (5.1%). Neutrality tests of Pfs25 variations showed significant negative values of Tajima’s D (-2.15, p<0.01) and Fu’s F (-10.91, p<0.001) statistics tests. Three loci pairs (123, 372), (364, 428) and (390, 394) were detected to be under linkage disequilibrium and none had history of recombination. These results suggested that purifying selection and inbreeding might be the drivers of the observed variation in Pfs25. Conclusion Given the low level of nucleotide diversity, it is unlikely that a Pfs25 antigen-based vaccine would be affected by antigenic variations. However, continued monitoring of Pfs25 immunogenic domain 3 for possible variants that might impact vaccine antibody binding is warranted.


Background
Leading transmission-blocking vaccine candidates such as Plasmodium falciparum surface protein 25 (Pfs25 gene) may undergo antigenic alterations which may render them ineffective or allele-specific. This study examines the level of genetic diversity, signature of selection and drivers of Pfs25 polymorphisms of parasites population in regions of western Kenya with varying malaria transmission intensities.

Methods
Dry blood spots (DBS) were collected in 2018 and 2019 from febrile outpatients with malaria at health facilities in malaria-endemic areas of Homa Bay, Kisumu (Chulaimbo) and the epidemic-prone highland area of Kisii. Parasites DNA were extracted from DBS using Chelex method. Species identification was performed using real-time PCR. The 460 base pairs (domains 1-4) of the Pfs25 were amplified and sequenced for a total of 180 P. falciparuminfected blood samples.

Results
Nine of ten polymorphic sites were identified for the first time. Overall, Pfs25 exhibited low nucleotide diversity (0.04×10 −2 ) and low mutation frequencies (1.3% to 7.7%). Chulaimbo had the highest frequency (15.4%) of mutated sites followed by Kisii (6.7%) and Homa Bay

Introduction
Over the past decade remarkable progress has been made in reducing the global malaria burden by coordinated public health interventions targeting both vectors and parasites [1]. However, progress in malaria control has been stalled partly due to the spread of insecticide and antimalarial drug resistance [1,2]. One of the interventions needed to reduce the malaria burden is an effective P. falciparum vaccine [3][4][5]. Barriers to advances in vaccine development include allelic variation of target antigens due to mutations in their immunodominant protein domains [6][7][8][9][10]. Currently, there are attempts towards developing transmission-blocking vaccines (TBVs) that target P. falciparum sexual stage parasites [5,11]. TBVs elicit host antibody responses that block the sporogonic cycle of the parasite in the mosquito vector, thereby thus reducing and, ultimately, stopping ongoing transmission in local communities. Among the various TBVs under development a vaccine based on the Pfs25 protein sequence is perhaps the most advanced [12][13][14][15]. The cysteine-rich 25 kilo-Dalton molecule is expressed in the protease-rich mosquito vector midgut post-fertilization, where it facilitates ookinete epithelium penetration, aggregation, and maturation to oocysts [16,17]. Protein expression of the Pfs25 protein is translationally repressed in the human host by the Pumilio/ FBF family RNA-binding protein [18,19]. The Pfs25 protein consists of four tandem epidermal growth factor (EGF) domains with 22 cysteine residues that is anchored on the parasite surface by a C-terminal glycosylphosphatidylinositol (GPI) [20]. Recombinant Pfs25 antigen in combination with other TBV antigens [12,21] or alone have been shown to elicit anti-Pfs25 antibodies [13,[22][23][24][25][26][27] that block oocyst development in standard membrane feeding assays.
Notably, translational repression of Pfs25 protein expression in the human host tempers enthusiasm for field deployment of a TBV based on this antigen alone due to the absence of natural boosting in malaria endemic communities [25,27].
The Pfs25 exon is considered to be highly conserved despite the existence of various sites with mutations [28][29][30]. Documented mutations include locus 392 in P. falciparum isolates from China and India and locus 428 of isolates from Cambodia and India [29][30][31]. The aim of this study was to determine the distribution of polymorphic sites, level of genetic diversity and identify possible signatures of selection in the four domains of the Pfs25 protein from P. falciparum parasites circulating in malaria-endemic and epidemic-prone regions of western Kenya.

Study site and sampling
A cross-sectional study was conducted in 2018 and 2019 in health clinics in malaria-endemic areas of Homa Bay and Kisumu (Chulaimbo) Counties and the malaria epidemic-prone highlands of Kisii County (Fig 1). A total of 302 patients presenting with fever and symptoms of uncomplicated malaria, e.g., myalgia, fatigue, non-localizing symptoms, were recruited to donate a finger prick blood sample. 138 patients were from Homa Bay County, 62 from Kisumu County (Chulaimbo) and 102 from Kisii County. Health facilities selected for sample collection in Homa Bay lie at 34.64190˚E-0.38000˚S and 1134-1330 metres above the sea level (asl), in Kisumu (Chulaimbo Sub County hospital) at 00.03572˚S-034.62196˚E and 1328-1458 m asl and in Kisii (Eramba health dispensary) at 34˚48 0 E, 00˚35 0 S; 1540-1740 m asl (Fig 1).
Four samples of 25 μL of blood were collected on coded Whatman™ Blood Stain Cards (GE Healthcare WB100014) containing details such as the patient's study code, date of collection, age, and gender as previously described [32]. Each of the collected coded Whatman™ Blood-Stained Card was singly placed in individual plastic bags containing silica gel before being enclosed together in a labelled envelope and transported to the International Centre of Excellence for Malaria Research (ICEMR) laboratory located at Tom Mboya University College in Homa Bay town for storage at -20˚C and later molecular processing.

DNA extraction
Genomic DNA was extracted from the 302 DBS samples using the Chelex resin (Chelex -100) method following the published method [33] with modifications. Briefly, a 3mm ring of blotted filter paper containing a blood sample was punched into a sterile coded 1.5ml Eppendorf tube using a sterile craft store puncher. 950μl of 1×PBS and 50μl of 10% saponin was added and mixed well before incubation at 4˚C overnight. The mixture was centrifuged at 12,000 rpm for 10 min at room temperature before discarding the supernatant content. Any remnant of saponin was washed by adding 1ml PBS and spinning at 12,000 rpm for 5 min. The PBS was discarded and the tube content spun for 15sec before removal of the liquid component using the P200 pipette. The Eppendorf tubes containing filter paper were air-dried at room temperature for 15min. 250μl (20%) of Chelex suspension was added and the mixture was incubated in a water bath maintained at 85˚C for 10min. During incubation, the mixture containing DNA was vortexed for 5min to suspend the extracted DNA. Upon completion, the mixture was spun at 12,000 rpm for 1 min and the DNA was transferred into a coded clean 0.5ml Eppendorf tube and stored at -20˚C. Cultured P. falciparum laboratory strain NF54 genomic DNA was also extracted and used as a positive control for PCR and sequencing.

Detection of Plasmodium falciparum infections
Real-time (RT-PCR) with species-specific 18s rRNA and probes were used to identify P. falciparum infections as previously described [34]. Briefly, RT-PCR was set with a final volume of 12μl containing 2μl of sample DNA, 6μl of PerfeCTa1 qPCR ToughMix™, Low ROX™ Master mix (2X), 0.5μl of the species-specific probe, 0.4μl of the species-specific forward primers (10μM), 0.4μl of the species-specific reverse primers (10μM) and 0.1μl of double-distilled water. The thermal profile used was set at 50˚C for 2 min, followed by 45 cycles of (95˚C for 2 min, 95˚C for 3 sec and 58˚C for 30 sec). Three positive samples from laboratory strain and three negative samples from blank filter paper were used as a positive and negative control for RT-PCR respectively. The RT-PCR amplification was performed on QuantStudio 3 Real-Time PCR System (ThermoFisher, Carlsbad, CA, USA). A total of 180 samples were confirmed positive for P. falciparum by RT-PCR. The level of parasitemia ranged from 120 to 146,880 parasites/μL blood with a median of 4,920 parasites/μL.

Pfs25 gene amplification and sequencing
A total of 180 P. falciparum infected samples (78 from Homa Bay, 40 from Chulaimbo, and 62 from Kisii) confirmed positive by RT-PCR were used to amplify the target Pfs25 gene. Amplification of the 460bp exon was performed through nested PCR in T100™ Thermal Cycler (Bio-Rad) as previously described [35]. Briefly, for Nest I PCR, a final reaction volume of 13μL was prepared by addition of 3μL of parasite DNA to a mixture containing 5.75μL of DreamTaq Green PCR Master Mix (2X), 0.5μL of Nest I forward primer (10μM), 0.5μL of Nest I reverse primer (10μM) and 3.25μL of double-distilled water. For Nest II, a final reaction volume of 23μL was prepared by addition of 3μL of Nest I amplicon to a mixture containing 11.5μl of DreamTaq Green PCR Master Mix (2X), 0.5μL of Nest I forward primer (10μM), 0.5μL of Nest I reverse primer (10μM) and 7.5μL of double-distilled water. Nested I and II PCR conditions were set as follows: initial denaturation at 94˚C for 3 minutes, 34 cycles of denaturation at 94˚C for 30 sec, annealing at 50˚C for 30 sec, primer extension at 68˚C for 1 min and final extension at 72˚C for 6 min. PCR amplification was performed on T100™ Thermal Cycler (Bio-Rad, Hercules, CA, USA). The quality of the 460bp amplicons from Nested II PCR were assessed by performing gel electrophoresis in 1.5% w/v agarose gel after amplification and before sequencing. Each of the 180 PCR products was purified by the addition of Exonuclease I and Shrimp Alkaline Phosphatase (ExoSAP-IT) and incubated at 37˚C for 15 min. The Exo-SAP-IT in purified PCR products was inactivated by heating at 80˚C for 15 min. The cleaned PCR products with quantity above 25ng/μl were selected for sequencing after testing 1μl of each using NanoDrop™ Lite Spectrophotometer (Thermo Scientific™). Bi-directionally sequencing was done using BigDye 1 Terminator v3.1 Sequencing Standard kit on ABI PRISM 1 3700 DNA Analyzer (Applied Biosystems, Foster City, CA, USA). Repeat sequencing was done on samples with poor reads to confirm mutated sites.

Ethics approval
The study was approved by the Maseno University Ethics Review Committee (MUERC protocol No. 00456) and the University of California, Irvine Institutional Review Board (HS#2017-3512) and received authorization from the Ministry of Health. All volunteers or their guardians gave written informed consent to participate in providing blood samples.

Data analysis
The sequence assembly was done using Geneious version 11.1.5 software. Multiple sequence alignment was done using ClustalW, a matrix-based algorithm that is in build in Mega X software [36]. Segregating sites, mutated codons, and type of mutation were determined by inputting sequences from each region and reference sequence NF54 (Accession number X07802.1) in CodonCode Aligner version 9.0.1 (CodonCode Corporation, www.codoncode.com). Mixed haplotype infections were defined as that sample had sequencing chromatogram with double peaks (second peak higher that 50% of first peak). The PHASE function of DnaSP Version 6.12.03 was used to infer the haplotypes of mixed infections. The Pfs25 antigen structural delineation was done by inputting full-length sequence (X07802.1) on Protein Homology/analogY Recognition Engine (PHYRE2) version 2.0. The generated Pfs25 model by Phyre2 was visualized, modified and mutated sites marked in UCSF Chimera version 1.15 [37]. The genetic diversity indices; nucleotide diversity (π) [38], the mean number of pairwise difference (k) [39], number of segregating sites (S), number of haplotypes (h), haplotype diversity (Hd) were computed per region from the aligned sequences using DnaSP software [40] and results confirmed using Arlequin version 3.5.2 [41]. Haplotype data generated from DnaSP in nexus format with slight modification was used to draw a haplotype network using Population Analysis with Reticulate Trees (Popart) version 1.7 software [42]. The ratio of nonsynonymous to synonymous mutations (d N /d S ) was computed to give a hint on selection pressure acting on Pfs25 antigen from this region [43][44][45][46]. Presence and type of natural selection or deviation from the standard neutral model were determined through computation of neutrality tests such Tajima

Genetic diversity indices of Pfs25 gene
Relatively low nucleotide diversity (π) was observed among the sequences analysed. The 177 Pfs25 sequences had π of 0.04×10 −2 , k of 0.16 and a nucleotide conservation index of 97.7% that slightly varied per study site ( Table 2). Chulaimbo isolates had the highest observed π (0.07×10 −2 ) and Hd (0.15) followed by Kisii (π = 0.03 and Hd = 0.10) then Homa Bay (π = 0.02 and Hd = 0.08). There was no significant difference (p = 0.400) when the population pairwise FSTs were computed for Homa Bay and Kisii sequences (FST = 0.00). Similar results (p = 0.160) was observed for population differences between Chulaimbo and Kisii (FST = 0.037). No significant difference (p = 0.110) in observed allelic variation between Homa Bay and Chulaimbo sequences (FST = 0.039). The highest diversity index in Chulaimbo was due to a high frequency of T130T and V132I codon changes ( Table 3). Homa Bay and Kisii had the highest number of Pfs25 haplotypes (four each) compared to Chulaimbo (Fig 3). Among analysed samples from Chulaimbo, blood samples of patients from the same household were found to harbour different haplotypes. Six blood samples, 3 from each different household had either Hap1 with no mutated sites or Hap5 with both mutated codon T130T and V132I. This was however not observed in sequences from Homa Bay and Kisii.

Signature of selection and other mutation drivers on Pfs25 gene
The computed ratio of nonsynonymous and synonymous substitutions (d N /d S ) for all Pfs25 sequences from western Kenya was 5.37 ( Table 1). The d N /d S ratio was not only greater than 1 Table 1 50 were also significant (p<0.05). The selection pressure exacted its effect on loci existing between 98-148bp and 323-373bp in Pfs25 sequences from Homa Bay (Fig 4). These loci harbour the observed  segregating sites in Homa Bay sequences. Among sequences from Chulaimbo parasites, Tajima's D and Fu's F S were -0.73 and 0.8, respectively. All these test results were not significant (p>0.05), with FLD � and FLF � values of 0.77 and 0.37 respectively. Deviation from the standard neutral model among the Chulaimbo samples was observed on loci existing between 348-423bp where segregating sites were observed (Fig 4). Sequences from Kisii had significant (p<0.05) Tajima's D (-1.84) and Fu's F S (-3.9) values. The FLD � and FLF � results were -3.54 (p<0.02) and -3.52 (p<0.02) respectively. Purging selection on loci blocks ranging from 223-469bp was observed for these sequences (Fig 4).

Segregating sites Domain Allelic frequency Substituted bases Type of Substitution Codon change Type of mutation Homa Bay n (%) Chulaimbo n (%) Kisii n (%)
To further assess the presence or history of drivers shaping allelic patterns within the four domains of the Pfs25 antigen, we assessed a pair of loci that were under LD as well as having a history of recombination events (Rm). Three pairs of loci (123, 372), (364, 428) and (390, 394) within polymorphic Pfs25 sequences were observed to be under LD ( Table 3). The three loci pairs had highly significant (P<0.001) LD values. The presence of significant LD values is an indicator of inbreeding as a driver for the observed sequence changes in each study site. None of mutated Pfs25 sequences from western Kenya had a history of recombination.

Discussion
This study revealed novel polymorphic sites across the four immuno-dominant domains of the Pfs25 antigen. Most of these sites were dimorphic sites and involved Pfs25 D3. They resulted in 9 haplotypes that were unique ("private") not only to Homa Bay, Kisumu or Kisii highlands but also to western Kenya. The Pfs25 gene exhibited low nucleotide diversity and a high conservation index. Natural selection was identified to be the leading cause of these mutations with purifying selection purging on specific loci blocks. Additionally, inbreeding was established as an alternative driver for linked polymorphic loci in analysed sequences from each study site. There was high level of interbreeding in P. falciparum population from Homa Bay, Chulaimbo and Kisii zones.
Generally, low nucleotide diversity was observed in Pfs25 sequences from parasites population in western Kenya compared to Asia (0.11×10 −2 ), Thailand (0.09×10 −2 ), Cambodia (0.12×10 −2 ), India (0.31×10 −2 ), Brazil (0.18×10 −2 ) and Vietnam (0.26×10 −2 ) [49]. Chulaimbo sequences had high nucleotide variation compared to Homa Bay and Kisii. The observed variation between the two malaria-endemic sites and a close similarity between Homa Bay and Kisii implies that ongoing indoor residual spraying targeting malaria vectors within Homa Bay [50] may be directly or indirectly reducing the numbers of parasite clones or genetic diversity. Despite having high number of haplotypes across western Kenya, their observed number was within the set limit of the total number of segregating plus one (S+1), thus implying the population is under selective sweep or growth [48,51]. Apart from the dimorphic site 428 (codon V143) which had previously been reported in Pfs25 sequences from Cambodia [31,49], India [30,31,49] and now in Kisii highland, the remaining nine sites were novel. The amino acid change observed in this codon (V143G), however, did not correspond to documented V143A mutation following differences in base substitution at locus 428 (T/G as opposed to T/C) of the gene in one of the sequences from the Kisii area. None of the sequences from western Kenya had G131A mutations that had been previously observed in the parasite population from India and Cambodia [31]. However, three sequences from Chulaimbo study sites had mutations at a nearby codon V132I, and was observed at a frequency of 7.7%. Compared to its analogue in Plasmodium vivax, two mutated codons T130 and V132 identified in Pfs25 had been described in Pv25 in Asia [52,53].
The majority of the polymorphic sites we discovered were in the Pfs25 D3 region. This domain had previously been described to be highly immunogenic compared to others, and is also known for harboring binding sites of 1D2 and 4B7 human monoclonal antibodies used in standard membrane feeding assay studies of P. falciparum mosquito infectivity [14,25,54]. Among mutated codons in D3, codon change C110C and C115W targeted two of the 21 cysteine residues that are known to be highly conserved and function in maintaining the folding pattern of the Pfs25 polypeptide [14,31,55,56]. The presence of such silent and nonsynonymous mutation points to the possibility of a weak or strong selection pressure acting on nearby codons within the Pfs25 antigen [57]. If pronounced, such mutation may interfere with the folding pattern of the Pfs25 antigen, and thereby impact interaction between host antibodies elicited by Pfs25 TBV and modify interactions with other B-cell epitopes as observed with other proteins [58] and cysteine residues in the D2 region of Pfs47 [58].
As opposed to previous reports describing the absence of selective pressure acting on the Pfs25 gene [28], our studies suggest that natural selection may play a role in shaping allelic diversity of the Pfs25 in western Kenya. Specifically, purifying selection was suggested based on computation of the ratio of nonsynonymous and synonymous substitutions [43][44][45][46]. This was evident across all mutated codon since they displayed a significant Tajima's D value. The loci pairs under LD not only reaffirmed the history of selection but also confirmed inbreeding as another factor in the spread and sustenance of such mutations [59]. We speculate that selection pressure may be arising from unknown factor in protease-rich midgut of female Anopheline vectors.

Conclusion
The Pfs25 gene from the malaria-endemic and epidemic-prone region of western Kenya revealed varying levels of genetic diversity and novel haplotypes. Purifying selection and inbreeding were predicted as the cause and drivers for the observed variations. The low level of nucleotide diversity is an indicator that TBV based on Pfs25 sequences is less likely to be affected by antigenic variations. However, this study recommends continued monitoring of the Pfs25 gene from different malaria-prone regions including areas where clinical trials have been conducted. This will not only aid in unravelling new polymorphic sites that could have an effect on antibody binding especially on immunogenic D3 but also monitor the durability of the Pfs25 gene as a potential TBV.